Imagine for Me: Creative Conceptual Blending of Real Images and Text via Blended Attention

Cho, Wonwoong, Zhang, Yanxia, Chen, Yan-Ying, Inouye, David I.

arXiv.org Artificial Intelligence

Blending visual and textual concepts into a new visual concept is a unique and powerful trait of human beings that can fuel creativity. However, in practice, cross-modal conceptual blending for humans is prone to cognitive biases, like design fixation, which leads to local minima in the design space. In this paper, we propose a T2I diffusion adapter "IT-Blender" that can automate the blending process to enhance human creativity. Prior works related to cross-modal conceptual blending are limited in encoding a real image without loss of details or in disentangling the image and text inputs. To address these gaps, IT-Blender leverages pretrained diffusion models (SD and FLUX) to blend the latent representations of a clean reference image with those of the noisy generated image. Combined with our novel blended attention, IT-Blender encodes the real reference image without loss of details and blends the visual concept with the object specified by the text in a disentangled way. Our experimental results show that IT-Blender outperforms the baselines by a large margin in blending visual and textual concepts, shedding light on the new application of image generative models to augment human creativity.
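The abstract's core mechanism, blending the latents of a clean reference image with those of the noisy generated image via attention, can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the paper's implementation: the function name `blended_attention`, the concatenation of reference keys/values alongside the generated image's own, and the toy choice of letting keys double as values are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def blended_attention(q_gen, kv_gen, kv_ref, dim):
    """Queries come from the noisy generated latents; keys/values are the
    concatenation of the generated latents and the clean reference-image
    latents, so the reference's visual concept can flow into the output."""
    k = np.concatenate([kv_gen, kv_ref], axis=0)  # (n_gen + n_ref, dim)
    v = k  # toy setting: keys double as values
    scores = q_gen @ k.T / np.sqrt(dim)           # (n_gen, n_gen + n_ref)
    return softmax(scores, axis=-1) @ v           # (n_gen, dim)
```

In an actual diffusion adapter this would replace (or run alongside) the self-attention of the denoising network, with the reference latents produced by encoding the clean image; here it only shows the shape of the computation.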


Automating Evaluation of Diffusion Model Unlearning with (Vision-) Language Model World Knowledge

Yeats, Eric, Hannan, Darryl, Kvinge, Henry, Doster, Timothy, Mahan, Scott

arXiv.org Artificial Intelligence

Machine unlearning (MU) is a promising cost-effective method to cleanse undesired information (generated concepts, biases, or patterns) from foundational diffusion models. While MU is orders of magnitude less costly than retraining a diffusion model without the undesired information, it can be challenging and labor-intensive to prove that the information has been fully removed from the model. Moreover, MU can damage diffusion model performance on surrounding concepts that one would like to retain, making it unclear if the diffusion model is still fit for deployment. We introduce autoeval-dmun, an automated tool which leverages (vision-) language models to thoroughly assess unlearning in diffusion models. Given a target concept, autoeval-dmun extracts structured, relevant world knowledge from the language model to identify nearby concepts which are likely damaged by unlearning and to circumvent unlearning with adversarial prompts. We use our automated tool to evaluate popular diffusion model unlearning methods, revealing that language models (1) impose semantic orderings of nearby concepts which correlate well with unlearning damage and (2) effectively circumvent unlearning with synthetic adversarial prompts.
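The evaluation loop the abstract describes, querying a language model for concepts near the unlearned target and for adversarial prompts that try to circumvent unlearning, can be sketched as below. This is a minimal sketch under stated assumptions: the function `build_unlearning_probe`, the prompt wording, and the `query_lm` callable are all hypothetical stand-ins, not the actual autoeval-dmun API.

```python
def build_unlearning_probe(target, query_lm):
    """Assemble an evaluation probe for a diffusion-unlearning target:
    nearby concepts check for collateral damage to retained knowledge,
    adversarial paraphrases check whether unlearning can be circumvented.
    `query_lm` is any callable mapping a prompt string to a list of strings."""
    nearby = query_lm(
        f"List concepts semantically close to '{target}'."
    )
    adversarial = query_lm(
        f"Describe '{target}' without naming it directly."
    )
    return {"target": target, "nearby": nearby, "adversarial": adversarial}
```

In practice the returned prompts would be fed to the unlearned diffusion model, and a vision-language model would score the generated images for residual traces of the target concept and for degradation on the nearby concepts.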


AI 'Photos' of What Cartoon Characters Would Look Like in Real Life

#artificialintelligence

What would famous animated characters from movies and TV shows look like in real life? One digital artist has created a fascinating series of AI-assisted "portraits" that provide the answers to that question. "Since I discovered artificial intelligence, I've been challenging myself to do things I would never have imagined doing," Diao tells PetaPixel. "With several studies and a lot of practice, I thought it was time to bring some Disney characters to human life." Diao says he grew up watching The Simpsons, Hanna-Barbera shows, and Disney animations that made a big impact on his life.


How Ready Player One's FX Team Used Its Own AI To Create OASIS Digital Trends

#artificialintelligence

Ahead of the 91st Academy Awards on Sunday, our Oscar Effects series puts the spotlight on each of the five movies nominated for "Visual Effects," looking at the amazing tricks filmmakers and their effects teams used to make each of these films stand out as visual spectacles. Ernest Cline's 2011 novel Ready Player One was once thought to be un-adaptable with its legions of licensed characters from television, movies, video games, and comic books assembling for a sprawling adventure within a virtual universe known as OASIS. And then along came Steven Spielberg to prove the skeptics wrong. Spielberg's adaptation of Ready Player One not only managed to translate the grand scope of its source material, but it also managed to deliver a film jam-packed with the iconic characters and pop-culture references that made the book so popular among a certain generation of readers. It did so with the help of a talented visual effects team led by four-time Academy Award nominee Roger Guyett, who was tasked with not only building a virtual universe populated by a host of familiar and not-so-familiar characters, but also making sure that the digital avatars of the film's lead characters were capable of conveying just as much emotion as their human counterparts. Digital Trends spoke to Guyett about the experience of bringing Ready Player One to the screen, building virtual universes, and finding genuine emotional depth among more than half a million CG creations.


Beyond Deep Fakes

#artificialintelligence

Researchers at Carnegie Mellon University have devised a way to automatically transform the content of one video into the style of another, making it possible to transfer the facial expressions of comedian John Oliver to those of a cartoon character, or to make a daffodil bloom in much the same way a hibiscus would. Because the data-driven method does not require human intervention, it can rapidly transform large amounts of video, making it a boon to movie production. It can also be used to convert black-and-white films to color and to create content for virtual reality experiences. "I think there are a lot of stories to be told," said Aayush Bansal, a Ph.D. student in CMU's Robotics Institute. Film production was his primary motivation in helping devise the method, he explained, enabling movies to be produced more quickly and cheaply.


Google Allo now transforms your selfies into custom emoji

Daily Mail - Science & tech

Every day, an estimated one million selfies are taken around the world. In a bid to make selfies more exciting, Google has introduced a new feature in its messaging app, Allo. The tool combines AI with the work of artists to turn selfies into custom emoji stickers. Users can snap a quick photo of themselves, and it will automatically be transformed into a cartoon, with customisation options to help personalise the avatar further. The new Google Allo feature combines neural networks with the work of artists to turn selfies into personalised stickers.


Disney Robot Project to Mimic Humans Larry Scheinfeld

#artificialintelligence

Artificial intelligence (AI) and robots in the technology sector are some of the emerging trends of the century, spearheading a revolution where workers may soon be replaced by robots and automated systems. Disney is already jumping into the world of robotics through projects developing at Disney Research, a global network of research labs working on a variety of innovative technologies and automated systems. One of the most recent productions is a set of robotic arms that mimic human movement. Here's a closer look at some of the latest developments in the field of robotics, and what the future may hold for both consumers and companies as AI and robots become more commonplace. One of the most interesting projects underway at Disney Research labs, in technology that mimics human movement, is a camera-mounted robot named 'Jimmy'. This particular robot is designed to stream video content to an operator that is wearing a virtual reality headset.